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ABSTRACT 



We demonstrate through numerical simulations with real data the feasibility of using compressive sensing techniques for the ac- 
quisition of spectro-polarimetric data. This allows us to combine the measurement and the compression process into one consistent 
framework. Signals are recovered thanks to a sparse reconstruction scheme from projections of the signal of interest onto appropri- 
ately chosen vectors, typically noise-like vectors. The compressibility properties of spectral lines are analyzed in detail. The results 
shown in this paper demonstrate that, thanks to the compressibility properties of spectral lines, it is feasible to reconstruct the signals 
using only a small fraction of the information that is measured nowadays. We investigate in depth the quality of the reconstruction as a 
function of the amount of data measured and the influence of noise. This change of paradigm also allows us to define new instrumental 
strategies and to propose modifications to existing instruments in order to take advantage of compressive sensing techniques. 
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1. Introduction 

Our present knowledge of the physical and magnetic proper- 
ties of solar and stellar plasmas owns a debt to the rapid de- 
velopment of spectro-polarimeters in the last decades. These 
instruments use dispersive elements for the spectral analysis. 
Because visible/infrared detectors are only sensitive to the in- 
tensity of light, modulators are used to encode the polarimet- 
ric information on intensity variations. Since most detectors are 
nowadays two-dimensional, the recovery of two-dimensional 
spectro-polarimetric information is carried out using scanning 
techniques: spatial in the case of long-slit spectro-polarimeters 
and spectral in the case of Fabry-Perot-like spectro-polari meters. 
Acc ording to the Nyquist-Shannon sampling theorem (Nvquis^ 
11928 ; Shannon & Weaver 1949), the correct sampling of a band- 
limited signal should be done at a rate equal to twice the band- 
width. In other words, one should sample each resolution ele- 
ment (spectral and/or spatial) with two pixels at least in order 
to be assure that all the frequencies in the bandwidth can be ob- 
served. This theorem has been applied with faith during the last 
half century but one should note that it is only valid for band- 
limited signals. 

During the la st few years, the emerging theory o f compres- 
sive sensing (CS ; ' Candes et al.l2006bt |Donoho 2006) is showing 
that this sampling is indeed too restrictive when some details of 
the signal structure are known in advance. The interesting point 
of the new CS paradigm is that, in many instances, natural sig- 
nals have a structure that is known in advance. For instance, stel- 
lar oscillations can be represented by sinusoidal functions of dif- 
ferent frequencies, images can be represented in a multiresolu- 
tion analysis using wavelets, etc. The key point is that, typically, 
only few elements of the basis set in which we develop the signal 
are necessary for an accurate description of the important physi- 
cal information. The innovative character of CS is that this com- 
pressibility of the observed signals is inherently taken into ac- 



count in the measurement step, and not only in the post-analysis, 
thus leading to efficient measurement protocols. Instead of mea- 
suring the full signal (wavelength variation of the Stokes profiles 
in our case), under the CS framework, one measures a few linear 
projections of the signal along some vectors known in advance 
and reconstructs the signal solving a non-linear problem. 

A quick review of the literature shows us that signals 
arising in natural phenomena are typically compressible (see 
JPEGQ compression and wavelet, curvelet or ridgelet decompo- 
sitions of images, among many othe rs). Since this is also the 
case for astronorni cal data (e.g., Miihlmann & Hanslmei ej 19961; 
Fligge & Solankill 997; Belmon et al. 2002; Polvgiannak is et all 
20031; iDollet et alj 12004^ .Bernas et ah ,2004) . it has been sug- 
gested that CS can be used to all eviate telemetry pr oblems with 
space telescopes like Herschel ( iBobin et alj 12008) or Cassini 
( Belmon et al. 2002.) and to improve existin g techniques for th e 
reconstruction of radio interferometric data dWiaux et al ] l2009h . 
More specifically, several works also demonstrate that this is 
also true in the field of polarimetry jSo cas-Navarro et al. 2001; 
Lopez Ariste & Casini 2002|; ISkumanich & Lopez Ariste 2002; 
Casini et al.. ,2005). Following this idea. i Lites et al.. (,200Z) have 
analyzed the effectiveness of using JPEG compression for re- 
ducing the amount of data that needs to be transferred through 
telemetry for the Hinode space telescope. This option is cur- 
rently in use in the mission (see Tsuneta et al.l i2008). All these 
results and those of the analysis we carry out in this paper prompt 
the interest of investigating the appropriateness of employing 
CS ideas to measure spectro-polarimetric signals, specifically in 
space telescopes but also for ground-based telescopes. Through 
the use of these techniques, we anticipate an enhanced perfor- 
mance in terms of de-noising and data acquisition rates which 
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' The name JPEG stands for Joint Photographic Experts Group, and 
is a lossy compression format for images based on the application of 
sparsity-enhancing linear transformations. 
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eventually may have an impact on the choice of detector tech- 
nologies and data transfer. 

The outline of the paper is the following. Section |2] gives a 
brief description of the CS paradigm, showing how signals can 
be recovered from a few linear projections and the properties 
that such projections need to fulfill. An analysis of the compress- 
ibility of spectro-polarimetric signals of interest is shown in |3j 
SectionlUpresents recovery examples, with an analysis of the in- 
fluence of noise. Finally, ^ shows novel instrumental strategies 
based on CS ideas, while the conclusions are presented in ^ 

2. Compressive sensing 

2.1. Theoretical considerations 

Because of the innovative character of CS, we give a brief de- 
scription of the most important points, while more mathematical 
details are discussed in Appendix lAl For a more in- depth de- 
scription, we refer the reader t o recent references (e.g.. lBaraniukl 
l2007HCandes & WakinlllOOSl and references therein). 

The usage of compressive sensing techniques for the mea- 
surement of a signal x' represented as a vector of length M is 
based on the following two key ideas: 

- Instead of measuring the signal itself, one measures the 
scalar product of the signal with carefully selected vectors: 



y - Ox' + e. 



(1) 



where y is the vector of measurements of dimension A^, O is 
an NxM sensing matrix and e is a vector of dimension A^ that 
characterizes the noise on the measurement process. Note 
that the previous equation describes the most general linear 
multiplexing scheme in which the number of measurements 
M and the length of the signal A^ may differ. In the standard 
multiplexing case, the number of scalar products measured 
equals the dimension of the signal (A^ - M). Consequently, 
it is possible to recover the vector x' provided that rank(O) - 
N, so that the problem is not ill-conditioned. In other words, 
one has to verify that every row of the O matrix is orthogonal 
with respect to every other row. 

The second key ingredient of CS is the assumption that the 
signal of interest is sparse in a certain basis set (or can be 
efficiently compressed in this basis set). Any compressible 
signaQ can be written, in general, in the following way: 



(2) 



where x is a /T-spars^ vector of size M and is the trans- 
pose of a M X M transformation matrix associated with the 
basis set in which the signal is sparse. For instance, W can be 
the Fourier matrix if the signal x' is the combination of a few 
sinusoidal components. Other transformations of interest are 
the wavelet matrices or even empirical transformation matri- 
ces like those found using principal component analysis. 

The combination of the those ingredients leads to the multi- 
plexing scheme: 



y - OW X + e. 



(3) 



^ A signal is said to be compressible (or quasi-sparse) if it is possible 
to find a basis for wiiich the projection coefficients along tlie vectors 
of the basis reordered in decreasing magnitude decay in absolute value 
like a power-law. 

^ A vector is A'-sparse if only K elements of the vector are different 
from zero. 
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Fig. 1. Test showing how compressive sensing works for detect- 
ing a sparse signal. The upper panel shows a sparse signal made 
of four spikes of very short duration. The lower panel presents 
24 measurements built as the scalar product of the signal with 
Gaussian random noise. The stars show the reconstructed signal, 
showing that perfect recovery is possible with only a very small 
fraction of the measurements- 



with the hypothesis that x is s parse, whic h renders CS feasi- 
ble. It has been demonstrated bv lCandes et al. (2006b) that, even 
if rank(OW^) <g: N (we have much fewer equations than un- 
knowns), the signal x can be recovered with overwhelming prob- 
ability when using appropriately chosen sensing matrices O. 

When the number of equations is less than the number of un- 
knowns, it is usual to solve Eq. © using least-squares methods 
that try to minimize the £2 norrrQ of the residual. This is usu- 
ally accomplished using techniques based on the singular value 
decomposition (see, e.g.. lPress et al.|[T986h . However, such min- 
imization is known to return non-sparse results (e.g., iRomberj 
2008, and references therein). A more appropriate solution to 
Eq. (O is to look for the vector with the smallest /'o pseudo-norm 
(the num ber of non-zero elem ents of the vector) that fulfills the 
equation jCandes et al.ll2006bl) : 



min II X llo subject to || y ■ 



OW^x 



|2< e. 



(4) 



where e is an appropriately small quantity. The solution of the 
previous p roblem is, in genera l, not computationally feasible. 
However, dCandes et al.ll2006bllah demonstrated that, if the ma- 
trix OW^ fulfills ce rtain conditions described in Appendix lAl 
(ICandes et al.ll2006ah . the problem 



min II X III subject to || y - OW^x ||2< e. 



(5) 



is equivalent to that of Eq. (|4]i. The advantage is that very effi- 
cient numerical methods exist for the solution of such problem 
(the one used in this paper is described in AppendixlAli. 

Figure [T] shows a very simple example that summarizes the 
essence of compressive sensing. The upper panel presents a sig- 
nal (solid line) that is 4-sparse in the basis of Dirac delta func- 
tions. Instead of measuring the full length of the signal, the lower 
panel of the figure shows a very small amount of scalar products 
of the signal with a Gaussian random matrix. The stars present 
the reconstructed signal using only 24 such measurements. Since 
this is a noiseless example, perfect reconstruction is obtained. 



The {„ norm of a vector is given by || x ||„= (2, U,|")''" if n > 0. 
The Cq pseudo-norm is given by the number of non-zero elements of x. 
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2.2. Practical considerations 

The CS framework offers several advantages over standard mea- 
surement paradigms. On the one hand, since one only measures 
linear combinations of the signals, the flux of information that 
any sensor has to deal with is usually much smaller (thanks to 
compression). This is probably of secondary interest for ground- 
based instruments since the infrastructure to cope with such large 
fluxes of data is available. However, this could be of interest 
for space-borne instruments, where the flux of data is limited 
by telemetry. As a sub-product of the simplification on the mea- 
surement, the reconstruction of the signal is much more time 
consuming, but can be done efficiently a-posteriori without af- 
fecting the measurement process. On the other hand, if appro- 
priate sensing matrices are chosen, the measurement process can 
be considered universal and does not depend on the exact basis 
set in which the data is sparse. In other words, one first measures 
projections and this assures that the data can be reconstructed 
a-posteriori if the data is sparse in any (unknown a-priori) basis 
set. 

Ground-based instruments may draw advantages from the in- 
creased cadence at which data is acquired. An instrument mea- 
suring spatial and spectral information with a 2D detector is 
forced to scan one of the extra dimensions of the data space, ei- 
ther the spectra in filter instruments or one of the spatial dimen- 
sions in spectrograph instruments. The CS framework may di- 
minish the number of measurements in the spectrum space thus 
shortening the time in which the 3D data cube is acquired by the 
instruments. Boosting data acquisition rates is always advanta- 
geous from both the point of view of a time evolving observa- 
tional target (as solar structures) or from the point of view of de- 
formations or aberrations introduced by atmospheric turbulence 
(seeing). 

3. Compressibility of the signals 

As reported in ^ any signal amenable to compressive sensing 
has to be sparse or compressible in some basis set. The pur- 
pose of this sectio n is to test to what ex tent polarimetric signals 
are compressible (Asensio Ramos et al. 2007). We focus mainly 
on linear and circular polarization profiles, although our results 
can be extended to standard spectroscopic observations with- 
out efl'ort. We present results for signals produced by scatter- 
ing processes and for signals produced by the Zeeman effect un- 
der the presence of a magnetic field. Concerning the basis set in 
which the signals are sparse, we focus on the wavelet family for 
the case of scatteri ng polarizatio n and on the principal compo- 
nent analysis (PCA: lLoevell 955') decomposition for the Zeeman 
case. These are just two possible candidates and we want to 
stress that it is advantageous to analyze each case in detail in 
order to find the basis set in which the signal is as sparse as pos- 
sible. 



3. 1 . Sparsity of tine Second Solar Spectrum 

The wavelength variation of the fractional linear polarization 
Q/I measured very close to the solar limb is nowadays usually 
known as the second solar spectrum. Its name comes from the 
large wavelength variability, in some sense c omparable to the 
standard Fra unhoffer intensity spectrum (see iGandorfeJ l2000i 
I2OO2I l2005h . Examples of the variability are shown in Fig. |2] 
where many of the peaks detected correspond to specific spec- 
tral lines. Certain lines produce very conspicuous signals like the 
neutral sodium doublet shown in the middle panel of Fig.|2] 



It is apparent that the spectrum of Q/I cannot be, in gen- 
eral, considered to be sparse in the wavelength domain because 
it is composed of broad peaks with a large wavelength variabil- 
ity. However, driven by the typical shape of the line profiles, 
we analyze the sparsity properties of the second solar spectrum 
when decomposed on the wavelet domain. To this end, we se- 
lect standard wavelet mothers that are widely used in other ap- 
plication (e.g., Jensen & la Cour-Harbo 2001): the Daubechies 
family, the Coiflet family and the Haar family. The discontin- 
uous Haar family is appropriate for decomposing pixel-based 
data. The other families produce smoother approximation to the 
data. It is left for the future to analyze the potential of other 
families, especiaUy non-ort hogonal redundant wavelets (e.g.. 



IStarck et 



es pecial ] 
al.lll997t 



(Idel Toro Iniesta & Lopez Aristeii2003l 
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Fhgge & S olanki 1 997h or Hermite functions 



We carry out an experiment for characterizing the compress- 
ibility of the second solar spectrum. We select a large piece of 
the spectrum in which many small signals are present, together 
with a strong signal produced by the D2 line of Ba 11. The spec- 
trum is shown in black solid lines in Fig. |2] The full spectrum 
is wavelet-decomposed (the wavelet of choice is shown in each 
panel) and thresholded so that only a certain number of wavelet 
coefficients survive, while the rest of coefficients are set to zero. 
This is an efficient way of compressing the signal provided that 
the thresholding fundamentally cancels noise and leaves the sig- 
nal unperturbed. The upper left panel of Fig. [2] shows what 
happens when only 10% of the wavelet coefficients are main- 
tained, while the right panel indicates the behavior after setting 
to zero 98% of the coefficients. Since the zero coefficients are 
not necessary in the reconstruction, this thresholding leads to 
an important compression of the signal. The signal is then re- 
constructed using the inverse wavelet transform. We note that 
even in the case of only 2% of the coefficients, the important 
signals are nicely recovered while the noisy part of the spec- 
trum is largely reduced. Apparent from the figure is the fact that 
the behavior is very similar for all the wavelet families we have 
tested, although the computing times are different, being larger 
for wavelets with a larg er number of non-vanishing moments 
(e.g., I Jensen & la Cour-Harbo .2001,) . This can be an issue that 
should be taken into account depending on the balance between 
the desired smoothness of the reconstruction and the computing 
time. 

Other examples are shown in the middle and lower panels of 
Fig. 121 for the case of the Na i doublet at 5890 Aand the Sr i Hne 
at 4607 A, respectively. The first one presents a case in which 
low-frequency (the large scale quantum interference between the 
two lines of the doublet) and high-frequency information (the 
large variability of the profiles close to the core of the line) 
coexist. This poses an interesting problem to any compression 
method because it has to retain low- and high-frequencies si- 
multaneously. Apparently, the wavelet compression does a good 
job on this multiplet and all important details can be retrieved 
even with only 2% of the coefficients. The lower panels of Fig. 
|2]show the case of the Sr i line at 4607 A, which is a very strong 
signal embedded in a quasi-flat continuum. In this case, recon- 
structing only with 2% of the coefficients gives a bad represen- 
tation of the true underlying signal. Some ripples appear when 
using Daubechies and Coiflet wavelets on the quasi-continuum, 
although the amplitude of the signal is still correctly recovered. 
The reconstruction with the Haar wavelet gives a very good rep- 
resentation of the Sr i line but the depolarizations in the red wing 
of the line and at 4606.3 A are not correctly recovered. The re- 
construction with 10% of the coefficients is almost perfect. 
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We have measured the quahty of the reconstruction using the 
68% and 95 % percentiles of the difference between the original 
and the reconstructed signal. The value of these quantities ver- 
sus the percentage of remaining non-zero coefficients is shown 
in Fig. [3] The left panel has been obtained with the data from 
the Ba ii D2 line, while the right panel is associated with the 
Na I data. We have verified that the distribution of differences is 
close to normal except in the cases in which the reconstruction 
is done with too few coefficients. Therefore, the 68% and 95% 
percentiles are close to the standard deviation and twice the stan- 
dard deviation, respectively. The lines without symbols present 
the 68% percentile, showing that differences are in both cases 
below 10"^ when retaining just 10% of the profiles. For Qjl sig- 
nals that are on average at the level of ~ 0. 1 , we find that relative 
errors are typically below lO"-'. The 95% percentile gives rela- 
tive errors slightly above 10"^ for these cases. The results tend 
to indicate that differences among wavelet families are relatively 
small. 



3.2. Sparsity of Zeeman signals 



Under the presence of a sufficiently strong magnetic field, 
the Zeeman effect usually controls the emergent observed po- 
larization. In order to test for the compressibility properties 
in the Zeeman-dominated case, we use a dataset obtained 
with the Solar O ptical Telescop e/Spectro-Polarim eter SOT/SP 
dLites et al. 2001) sboaxd Hinode (Kosugi et al. 2007). The prop- 
erties of this dataset are typical of what one should encounter in 
the future if the CS techniques that we propose here are applied 
to future space missions. In this case, we test for two different 
basis set for compressibility in Fig. |4] The first is the universal 
Daubechies-8 wavelet (left panel). The results we show are not 
very sensitive to the specific chosen wavelet and are represen- 
tative of the general behavior. The second is the empirical basis 
set obtained using the PCA decomposition (right panel). 

The results, that present the 68% and 95% percentiles of 
the distribution of the difference between the exact and the re- 
constructed profiles, indicate clearly that signals are again com- 
pressible. The best results are, obviously, obtained with the PCA 
basis set, because the eigenvectors are empirically constructed 
to maximize the sparsity of the signal (only a few eigenvectors 
are necessary to reconstruct the signal without noise). 

The main problem with the PCA basis set is that it is 
obtained empirically. Consequently, strictly speaking, one is 
not able to use the PCA basis set in a CS framework be- 
c ause a-priori the basis set is not k nown. However, according 
to lSkumanich & Lopez Arist^ (l2002h . some universality proper- 
ties of the PCA eigenvectors can be demonstrated when many 
profiles are included in a database. For this reason, we have 
tested that profiles observed with Hinode can be nicely com- 
pressed with the eigenvectors recovered from a completely dif- 
ferent dataset. The reason is that the same physical effects are 
controling the signals in both datasets. This opens the possibil- 
ity of using some kind of universal PCA basis for compressing 
Zeeman-dominated data. This basis set will surely contain de- 
tails of the spectral lines that are present in the majority of the 
observed profiles and that can be hardly recovered with fixed 
basis sets like wavelets. 



4. Signal recovery 



4.1. Examples 



Since we have demonstrated that the polarimetric signals are 
compressible, the CS framework can be used to measure such 
signals. We give here a few examples using different sensing 
matrices. The first ones are shown in the left panel of Fig. |5] 
We present the reconstruction of the Na i and Sr i signals ana- 
lyzed in ^with two different sensing matrices: (i) a Gaussian 
matrix with elements extracted from the A^(0, l/K) distribution 
(Candesetal. 2006b), with K being the sparsity of the signal 
and (ii) a binary sparse se nsing matrix with only 10 non-zero 
elements per measurement ( Berinde & Indvkll2008l) . Obviously, 
the binary sparse matrix has two advantages over the Gaussian 
matrix. First, the number of non-zero elements is very small as 
compared to the size of the matrix and effici ent sparse storag e 
and computational methods can be used (e.g.. lPress et alJ[T986l) . 
In our case, the sparse matrix contains less than 2% of the ele- 
ments different from zero. Second, the binary matrix is easier to 
implement on hardware using, for instance, micro-mirrors. Only 
5% of the elements of the solution vector are allowed to be non- 
zero for the case of Na i and 10% for the case of Sr i, according 
to the results presented in Fig.|2] The number of measurements 
used is 6K for the Sr i line and for the Na i doublet, roughly 
in accordance with Eq. ( IA.6b . while the reconstruction is done 
using the Daubechies-8 wavelet. The results show that a good 
recovery is possible in the two cases, with the advantage that, 
since sparsity is inherent to the reconstruction, noise is largely 
reduced in the reconstruction. In order to show how the tech- 
nique behaves with the number of measurements, we show in 
Fig. |6]the standard deviation of the difference between the re- 
constructed and original signal versus the number of measure- 
ments (normalized to the sparsity of the vector). The horizontal 
lines indicate an estimation of the noise level in the observations 
obtained as the standard deviation of a portion of the contin- 
uum. Note that, when the number of measurements is not large 
enough, the reconstruction does not work. However, as soon as 
condition (IA.6I 1 is fulfilled, reconstruction works properly. 

Other examples are shown on the right panel of Fig. |5] Stokes 
V profiles picked at two positions of an observation carried out 
with Hinode on February 27, 2007. The Fe i doublet at 630 
nm with amplitudes typical of active regions (upper panel) and 
quiet Sun (lower panel) are shown in black lines. The recon- 
structed signals using the same sensing matrices as above are 
shown in red and blue. The universal PCA basis is used and only 
11 of such eigenvectors are used, roughly 10% of the full ba- 
sis set. The number of measurements is 6K, in accordance with 
Eq. ( |A.6t . A good recovery is possible even for profiles whose 
amplitude is close to the noise level. The prior information en- 
coded in the sparse reconstruction produces that noise is slightly 
reduced with respect to the original profile. This is similar to 
what one would find after carrying out a PCA filtering of the 
data, but th e filtering is encoded inside the measurement tech- 
nique (e.g.. i Martmez Gonzalez et"aLll2008b ). The fundamental 
reason for this is that, while true signals produce sparse signa- 
tures, noise destroys the sparsity to some degree. Since the re- 
construction is done enhancing sparsity, it is not possible to re- 
cover noise and a filtering is carried out as a side effect of the 
reconstruction. 
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4.2. Sensitivity to noise 

To test the influence of noise on our ability to recover the spec- 
trum from the linear combinations given by the sensing matrix, 
we analyze a synthetic case. We have chosen the spectral line 
at 6302 A for its widespread use. A very simplistic Stokes / 
profile is built using a Voigt function tweaking the width and 
depth to fit the average profile of the solar atlas dWallace et al.l 
119981) . The Stokes V profile emerging from a magnetized atmo- 
sphere is built under the weak-field approximation, in which it 
is proportional to the wavelength der ivative of Stokes / (e.g., 
iLandi DegFInnocenti & Landolfil l2004i) . We used a magnetic 
flux density of 1 Mx/cm"'^, resulting in an amplitude of 4x10 '* 
in units of the continuum intensity. The small value of the mag- 
netic flux density is used intentionally so that the observation 
is close to the detection limit of present spectro-polarimeters on 
relatively short exposure times. We simulate observations with 
noise added and we use a Gaussian random sensing matrix with 
zero mean and inverse variance equal to the assumed sparsity. 
We carry out experiments, shown in Fig.|7] using Daubechies-8 
wavelets assuming that only 12 elements of the recovered vector 
are non-zero (left panel) and a universal PCA basis set assuming 
that the sparsity of the vector is 5 (right panel). The PCA basis 
is obtained from a dataset observed with Hinode. The number of 
measurements is set to six times the sparsity level in each case. 
The solid line in each plot indicates the standard deviation of the 
difference between the exact profile and the reconstructed one 
for each value of the noise level. The dashed line is the same 
quantity but calculated only for the noise. This curve is used to 
give an idea of the de-noising abilities of the decomposition on a 
wavelet/PCA basis. When the signal-to-noise ratio (SNR) is very 
poor, in both experiments we see that a reduction of almost an 
order of magnitude in the noise, clearly stating that the signal is 
well below the noise level. Only when the SNR approaches ~0.4, 
we see that the CS detects signal and gives a much better be- 
havior than just the direct measure of the profile using standard 
techniques. The reason has to be found on the fact that sparsity 
is promoting the detection of signals contrary to the detection of 
noise. Noise is not sparse in any of the used basis sets and the 
reconstruction is made assuming sparsity on the solution. This 
is an important information that we are including as a prior on 
the CS recovery, something that is not done in standard measure- 
ments. 



CS applies to the wavelength variation of a single Stokes pa- 
rameter. Therefore, under the assumption that the polarimetric 
modulation is achromatic over the observed spectral range, their 
effect can be interchanged and one ends up with solving four 
problems of the kind: 

y,. = OW^X; + e, (7) 

where x, is the de modulated wavelength variation o f the i- 
th Stokes parameter ddel Toro Iniesta & Colladosll2000h and we 
have assumed that the same CS matrix is used for all the mod- 
ulation states of the polarimeter. Consequently, the procedure to 
follow is to obtain the linear measurements for each position 
of the polarimet er, thus leading to a set of NNmod m easurements. 
Then, following ldel Toro Iniesta & Colladosl ( l2000l) . the pseudo- 
inverse of the modulation matrix O is applied to the A^mod mod- 
ulation states of each linear measurements. At the end, the four 
CS problems of Eq. (Q are solved. 

4.4. Fringes and otiier spurious signais 

Among others, non-polarized and polarized fringes are undesir- 
able contamin ations presen t in many spectro-polarimetric obser- 
vations (e.g., Semell l2003h . Observations are partially cleaned 
from these fringes using flat-fielding techniques. The remain- 
ing fringes are filtered out at the end of the reduction process 
with the disadvantage of having a large subjective component. 
Under the compressive sensing scheme these spurious signals 
are measured together with the real signal. It is a matter of the 
reconstruction to avoid introducing them into the final result, be 
it rejecting or reconstructing them together with the true signal. 
Obviously, the ideal situation is to employ a fringe-free spectro- 
polarimeter to obtain a better reconstruction. We defer the deep 
investigation of this issue to a later study. However, we want to 
point out that preliminary experiments indicate that the recon- 
struction can efficiently reduce the amplitude of periodic fringes 
if the basis set used is not able to reproduce them. The draw- 
back is that the ensuing reconstruction is less accurate than in 
the absence of fringes. Such a situation arises when applying 
a PCA basis set that is able to efficiently describe the spectro- 
polarimetric signals but not the periodic fringes. Another possi- 
bility of investigation is to assume that the fringes are sparse in 



4.3. CS witti poiarimetric moduiation 

Since existing instruments are not directly sensitive to polariza- 
tion in the optical and infrared spectral domains, it is custom- 
ary to use modulation schemes for measuring the Stokes pa- 
rameters. In some sense, modulation is another form of multi- 
plexing which is carried out using retarders and polarizers. The 
monochromatic Stokes vector entering the telescope S is modu- 
lated A^mod times for generating intensities that are linear combi- 
nations of the Stokes parameters of S: 



0.08 [ 



1° 



OS, 



(6) 



Sr I 
Na I 



0.04 



0.02 - 



0.00 L 



where each row of the A^mod x 4 matrix O equals the first row of 
the Mueller matrix of each modulation state. 

The obvious question is whether it is possible to apply CS 
for compressing the wavelength information while still carry- 
ing out the modulation for detecting the four Stokes parameters. 
The answer is that it is possible since both multiplexing opera- 
tions "act" on different spaces: polarimetric modulation applies The reconstruction is done using the sparse binary matrix and 
to monochromatic Stokes parameters and the sensing matrix of the Daubechies-8 wavelet. 



2 4 6 8 10 12 

Number of measurements [normalized to sparsity] 

Fig. 6. Standard deviation of the difference between the recovery 
and the original signal for different number of measurements. 
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the Fourier domain, carrying out the reconstruction merging the 
Fourier basis set and the basis set for the signal together. 



5. Applications 

5.1. Efficient Spectro-imagers 

Rec ent interest in Hadamar d techniques for spectro-polarimetry 
(see lHarwit & Sloanelll979l for more details) can greatly benefit 
from CS techniques. Hadamard techniques allow to condensate 
spectral information inside single detector pixels through mul- 
tiplexing. Before going into the use of these techniques in the 
framework of CS, it is advisable to describe these Hadamard 
techniques in spectro-polarimetry and the advantages they bring 
up through two illustrative examples. In traditional spectroscopy 
a certain amount of pixels (often a full dimension of a detec- 
tor array) are dedicated to measure intensities at different wave- 
lengths for the same point in an image. In Hadamard techniques, 
that is substituted by a temporal modulation over the Hadamard 
cyclic mask in a single detector pixel. Such exchange has two 
interesting applications: in long-slit spectroscopy, the spectrum 
can be multiplexed in a single pixel and instead of using a 2- 
dimensional detector array one can use a one dimensional array 
with increased acquisition cadences that allow for seeing freez- 
ing during the modulation cycle of polarimetry with the result- 
ing improve ment on polarimetric sensitivity (see, e.g., ZIMPOL; 
|Poveill200TI) . ____ 

In double-pass substractive spectroscopy (lMeinl2002l) the re- 
sulting image has been filtered by a narrow spectral slit that se- 
lects a single wavelength per pixel. Due to the dispersion of the 
first pass over the diffraction grating, the selected wavelength 
changes as one moves over the spatial image. To reconstruct the 
full 3D data cube with both spatial dimensions and spectral cov- 
ering, every pixel has to be scanned over a range of wavelengths. 
Through the use of Hadamard cyclic matrices one can have sev- 
eral wavelengths sampled simultaneously in every pixel. As a re- 
sult the temporal coherence of the recovered spectra is increased 
as several wavelengths over the spectral domain are detected si- 
multaneously. Also, as a side effect, the raw images do not show 
any evident spectral features, what makes them more suitable for 
image reconstruction techniques 

Such applications of Hadamard techniques to spectro- 
polarimetry are however hindered by the known fact that, in 
the presence of a multiplicative noise as photonic noise, the 
Hadamard transformation results in a reduction of SNRs with 
resp ect to the case of equivalent exposures without multiplex- 
ing dHarwit & Sloanei ri979). The reduction in th e SNR can b e 
limited with the use of appropriate binary masks (Wuttig 2005|), 
although it is never too large for the usual cases in solar spectro- 
polarimetry. 

Compressive sensing can help mitigate the problem. Since 
the Hadamard technique is applied to the spectral information, 
one can make use of the fact that there is prior information 
about the spectrum to be measured and that, in consequence, the 
space of spectral profiles (with polarization included) is sparse, 
as demonstrated in the previous sections of this work. The use 
of the CS techniques illustrated above allows the recovery of 
the full spectral information with just a few spectral measure- 
ments. In the language of Hadamard techniques, this translates 
into the fact that not all the data acquisitions attached to the 
cyclic Hadamard masks are required for the recovery of the spec- 
trum. If traditionally a Hadamard mask of dimension would 
require A^ cyclic measurements to solve the multiplexing linear 
system, CS techniques may be used to solve the system with 



just MK acquisitions, where K ^ N is the sparsity of the sig- 
nal (perhaps K ~ O.IA^ at most) and M > 1 is a small number. 
If each exposure lasts for fexp, the reduction in the required time 
for spectral information retrieval (from A^x fgxp to MKxts^p) can 
then be used, not only to accelerate the full process of measure- 
ment, but also to repeat N/(MK) times the same measurement 
and add them to gain a factor -\/N/ (MK) in signal to noise ra- 
tio. From the tests of previous sections we can conclude that a 
figure of K - Q.IN is sufficient for the seeked precisions of the 
measurement. With a factor M ~ 5, the repetition of the mea- 
surements with a reduced Hadamard cycle can be used to gain 
roughly a factor y/lO/M ~ 1.4 in SNR. Such a factor would 
largely compensate the loss of SNR inherent to the use of the 
Hadamard techniques with a multiplicative photon noise. We 
conclude that the use of compressive sensing is strongly recom- 
mended for a successful application of Hadamard techniques to 
spectro-polarimetry. 

5.2. Sub-Nyquist Spectrograpin 

When prior information about the expected signals is available, 
the possibility of a spectrograph sampling it is possible to think 
of a spectrograph able to measure the wavelength variation of the 
Stokes parameters using resolution elements (pixels in the cam- 
era) larger than the spectral sampling. In such a case, if one mea- 
sures with a camera containing «pix pixels, each one integrating 
k spectral sampling steps, the sensing matrix of size n^n x ktir^w 
can be written: 



0,y = 



1 : ki+ \ < i < k(i + 1) 







otherwise. 



(8) 



This sensing matrix is probably not very efficient for reducing to 
the optimal value the number of CS measurements but it suffices 
for our aims, since we are typically interested in cases where 
k is not very large. An example of this is shown in Fig. [8] A 




6301.0 6301.5 630.2,0 6302.5 6303.0 

Wavelength [.A] 

Fig. 8. Example showing how reliable reconstructions of high- 
resolution signals can be obtained from rebinned data, provided 
that the signal is considered to be compressible. The upper panel 
shows the original (black) and reconstructed profiles using 1/2 
(red) and 1 /4 (blue) of the original resolution. The data is recon- 
structed with 5 PCA eigenvectors of a universal basis set. The 
lower panel shows the measurements from which the reconstruc- 
tions are obtained. The original signal is shown in black, while 
red and blue lines show measurements rebinned to 1/2 and 1/4 
of the original resolution. 
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Fig. 9. Comparison between the measured Stokes V amplitude 
and the one measured using CS techniques with a Hadamard 
sensing matrix. The black dots show the comparison when only 
one measurement is done, while the red dots present the results 
using 8 measurements. 

Stokes V profile observed with Hinode, shown in black lines in 
both panels is rebinned to 1 /2 and 1 /4 the original resolution by 
adding two/four consecutive pixels together. The corresponding 
measurements are shown in the lower panel with red and blue 
lines, respectively. Using five PCA eigenvectors of the universal 
set discussed in section [3T2l the signals are reconstructed solv- 
ing the ii optimization problem. The reconstructed signals are 
shown in red and blue lines in the upper panel of Fig. [8] corre- 
sponding to 1 /2 and 1 /4 of the original resolution, respectively. 

We point out that, if the signal is known to be sparse in the 
Fourier basis, one could consider that the Nyquist-Shannon theo- 
rem should be applied to the frequency support where the signal 
is defined in the frequency domain. This is the case of signals 
for which the power associated with frequencies above a certain 
threshold are associated to noise. In such a case, one can use 
this new threshold, using the Nyquist-Shannon sampling theo- 
rem, to estimate the number of pixels per resolution element. 
Reconstruction should be done using the appropriate prior in- 
formation. We have verified, although not shown here, that good 
reconstructions can be obtained using 1/2 and 1 /4 of the original 
information by employing the Fourier basis set and forcing the 
signal in the Fourier domain to be sparse. This has the advantage 
over PCA eigenvectors that they are fully universal. 

5.3. Hadamard-PCA Magnetometer 

The combination of CS ideas and standard PCA techniques can 
be applied to develop an efficient magnetometer. To this end, let 
us assume that the Stokes V profile is well represented using the 
first PCA eigenvector vi(A) obtained empirically from a set of 
previous observations: 

ViA)^Avi(A). (9) 

This equation assumes, therefore, that the observed Stokes pro- 
files are 1 -sparse in the PCA basis set, so that one can recover 
the signal, according to the CS theory, using of the order of 4-6 
measurements. Using a sensing matrix O, we end up with the 
following measurement process: 

yi = •^J'^(^v)' = 1' ■ ■ ■ '^--^ (10) 
J 
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If the sparsity constraint is used, the proportionality constant A 
can be obtained from the observations with a linear fit: 

A^^iM, (11) 

where fi = Yjj'^ji'^i(^j) Note that a similar result could have 
been obtained if we measure projections of the data over the 
PCA eigenvectors. However, this presents the difficulty of its 
practical implementation because the PCA eigenvectors contain 
negative values. Measuring with sensing matrices like a binary 
(1/0) Hadamard matrix that is incoherent with the PCA eigen- 
vectors leads to a universal sensing process. We point out that 
only one measurement gives a rough estimation of the magne- 
tometer signal although it should be used with care. An example 
of the capabilities of the method is shown in Fig. |9l A scan of a 
sunspot obtained on 27 February 2007 with the SOT/SP onboard 
Hinode has been used. The vertical axis presents the value of the 
Stokes V profile at a fixed wavelength corresponding to the blue 
cr component of the 6302.5 A line. The horizontal axis presents 
the value inferred from the CS measurements. The black dots 
show the scatter when only one measurement is done, while the 
red dots show what happens when 8 measurements are done. 

We point out that, if enough measurements are taken, it is 
possible to include more eigenvectors in the decomposition of 
Eq. (|9]l and obtain information about the projection along them 
from the observations. If Stokes / is decomposed, continuum 
images and velocities can be inferred from the first and second 
eigenvectors, respectively. Likewise, if Stokes V is used, mag- 
netic flux and magnetic velocities can be inferred. 

6. Conclusions 

This paper demonstrates the feasibility of applying compressive 
sensing techniques for measuring the wavelength variation of 
the Stokes parameters observed in stellar atmospheres. We have 
shown that spectro-polarimetric signals are, in general, com- 
pressible on universal basis sets. However, in general, it is more 
advantageous to use empirical basis sets like that obtained from 
principal component analysis because data can be more effi- 
ciently reproduced on such a basis. We stress that the results 
presented in this paper are extensible to standard spectroscopic 
observations and not only to linear and circular polarization pro- 
files, so that any day-time and night-time spectrograph can take 
advantage of these techniques. 

According to our results, it is possible to measure Stokes pa- 
rameters using less than a half of the measurements one should 
carry out when strictly applying the Nyquist-Shannon sampling 
theorem. Compressing sensing leads to several interesting ef- 
fects. Since less measurements are made, an inherent reduction 
in the exposure time is present. Such reduction can be used to 
do more measurements in the same total time, thus allowing less 
noisy observations. The a-priori information encoded in the spar- 
sity condition results in the fact that the reconstructed signal is 
much less noisy than one should expect. The reason is that fil- 
tering is applied simultaneously while measuring. For instance, 
several of the examples shown in this paper produce signals that 
are automatic ally filtered with the principal component analysis 
appUed bv Martinez Gonzalez et al.l (l2008bllah . 

We have proposed potential applications of CS to the field of 
spectro-polarimetry, testing them numerically on real data. Some 
of these techniques can be straightforwardly applied to existing 
instruments, while other proposals need more profound modifi- 
cations. The future of observational spectro-polarimetry, at least 
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in solar physics, has to be rooted on the development of two- 
dimensional spectro-polarimeters. Since detectors are only two- 
dimensional at the moment, scanning schemes have to be used. 
We consider that double-pass substractive spectroscopy consti- 
tutes a very appealing technique for two-dimensional spectro- 
polarimetry if combined with multiplexing techniques using 
Hadamard masks. Compressive sensing will help reduce signif- 
icantly the total exposure time, thus allowing an increase in the 
final SNR for a fixed integration time. 

Because of the natural physical interpretation of projec- 
tions of_flie_observed_Stokesp along PCA eigenvec- 
tors jSku manich & Lo pez Arista |2002|) . it would be desirable 
to directly measure such projections. Using compressive sens- 
ing techniques, we have proposed a technique that, thanks to 
Hadamard masks, is able to retrieve such projections from an 
universal multiplexing. The advantage is that this multiplexing 
is binary and easy to build. 

Finally, we analyze the plausibility of a spectro-polarimeter 
that does not fulfill the Nyquist-Shannon sampling theorem. 
Some a-priori information about the expected signals is avail- 
able and compressive sensing techniques can take full advantage 
of this information for recovering the signals from a reduced set 
of measurements. We show with an example that it is possible to 
reconstruct Stokes profiles using the information obtained from 
adding the signal in consecutive pixels. In some sense, this can 
be understood as a super-resolution scheme in which one knows 
the basis set in which the high-resolution signal can be efficiently 
developed. 

Acknowledgements. We thank Rafael Manso Sainz and Maria Jesus Martinez 
Gonzalez for illuminating discussions and carefully reading the manuscript. 
Financial support by the Spanish Ministry of Education and Science through 
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Appendix A: Compressive sensing 

As noted in the main text, the multiplexing scheme for a sparse 
signal reads: 

y = OW^x + e, (A.l) 

with the condition that x is sparse. When the number of measure- 
ments is much smaller than the size of the signal, the sparsest 
solution fulfills: 

min II X llo subject to || y - OW^x ||2< e, (A.2) 

X 

where || x ||o is the £0 pseudo-norm that equals the num- 
ber of non-zero elements of the vector x. Since solving this 
pr oblem is, in general, not feasible, it has been demonstrated 
by ICandes et"aLl (l2006bllal) that, if the matrix OW^ fii lfiUs the 
Restricted Isometry Property (RIP: ICandes et al.ll2006ah . the so- 
lution to the problem 

min II X 111 subject to || y - OW^x ||2< e, (A.3) 

X 

is equivalent to that of Eq. jA.lj . We note in passing that the RIP 
condition is a sufficient condition and it is often too restrictive. 
The advantage of the last problem is that it can be easily solved 
using linear programming techniques. 

Intuitively, the RIP condition states that the action of the 
operator on the sparse vector x does not modify exces- 
sively its £2 norm. Mathematically: 

(1-6k)\\x\\1 < llOW^xlg < (l+5;,)||x|i (A.4) 
for all ^T-sparse vectors x and 6k < ^■ 



In spite of the mathematical importance of the RIP condi- 
tion, it is more intuitive to think on terms of coherence between 
the sensing matrix and the transformation matrix. In general, for 
a sensing matrix to be considered good, it should be as incoher- 
ent as possible with the transformation matrix. Every row of the 
sensing matrix should be able to obtain as much information as 
possible from the sparse vector x in order to facilitate its recon- 
struction with as few measurements as possible. This is achieved 
when the sensing matrix and the transformation matrix are as in- 
coherent a s possible. The coherence b etween the two matrices is 
defined as jCandes & Rombergl2007h : 

ju(0,W)= max \{<p,w)\, (A.5) 

where and w are, respectivel y, columns and ro ws of the matri- 
ces 4> and W. Quite generally jCandes & Romb erg 2007), for a 
sensing matrix of size NxM, it is possible to recover an i-sparse 
vector using a number of linear combinations that fulfills: 

M > Cyu(0, WfK log N, (A.6) 

where C is a constant of order 1 . According to this equation, if 
one is able to find sensing matrices with small coherence with 
respect to the basis set of interest, one should be able to recover 
the sparse signal using a number of measurements that is pro- 
portional to KlogN. As we have shown in the main text, the 
proportionality constant is typically between 4 and 6. 

A.1. Recovery algorithm 

For the recovery problem in our experiments, different methods 
have been developed during the last few years. After testing sev- 
eral methods, we found that the recent algorithm presented by 
jBlumensath & Daviesll2008l) is very efficient in terms of com- 
puting time and shows state of the art performance. The method 
uses the simple iterative procedure: 

x"+' = H, [x" + yuWO^ (y - OW^x")] , (A.7) 

where //j(t) is a non-linear thresholding operator that leaves as 
non-zero the s elements of the vector t with the largest absolute 
value, setting to zero the rest of elements. The method is guaran- 
teed to be stable thanks to the re-scaling quantity fi. Indeed, ac- 
cording to our experience, its stability is remarkable, converging 
to the solution in almost all experiments. The method is initial- 
ized by x*^ = 0. The main drawback of the method (usually com- 
mon to all recovery methods) is that the sparsity of the solution, 
s, has to be chosen in advance. The advantage is that only multi- 
plications with the matrices O and W (and their transposes) are 
needed. Many sparsity -promoting basis sets are accompanied by 
fast multiplication algorithms (e.g., fast fourier transform, fast 
wavelet transform, etc.). In such a case, the computing time of 
the multiplication with the W and matrices scales as 0{n) for 
the fast wavelet transform and as 0(n log «) for the fast fourier 
transform, instead of scaling as 0{n^) like in a standard matrix- 
vector product. 
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Fig. 2. Test showing the reconstruction of three widely known domains of the second solar spectrum: the top panels show a region 
around the Ba ii D2, the middle panels a region around the Na i doublet at 5890 A while the lower panel presents the region around 
the Sr I line at 4607 A. Reconstruction using different wavelets and different thresholds are displayed. In each panel, the left panel 
presents how the reconstructed spectrum (red line) compares with the original spectrum using Daubechies-4, Daubechies-8, Coiflet- 
3 and Haar wavelets when 90% of the wavelet coefficients are set to zero. The right panels show the results when only 2% of the 
wavelet coefficients are retained. These results demonstrate that the structure of the lines is nicely recovered with such a few number 
of wavelet coefficients. Only in the case of the Sr i, we find spurious ripples (except in the Haar case) due to the loss of information. 
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Percentage of remaining non — zero coefficients Percentage of remaining non — zero coefficients 

Fig. 3. Percentiles 68% (lines) and 95% (lines+symbols) of the differences between the true signal and the reconstructed signal 
when retaining a different percentage of the wavelet coefficients. The left panel corresponds to the data close to the Ba n D2 line, 
while the right panel is focused on the reconstruction of the Na i doublet. 
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Fig. 4. Percentile 68% (lines) and 95% (lines+symbols) of the differences between the true signal and the reconstructed signal for a 
Hinode observation. The left panel shows the case using a Daubechies-8 wavelet, while the right panel presents the results using the 
PCA empirical basis set. The noise level in Stokes Q, U and V is close to 1.6x10"^ in units of the continuum intensity. Therefore, 
less than 5% of the PCA eigenvectors are needed in order to reconstruct the profiles to this noise level. 
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Fig. 5. Examples of CS reconstructions of four different polarization signals. The left panels present the results for signals produced 
by scattering polarization when observed close to the limb. Reconstruction is done using Daubechies-8 wavelets. The right panels 
show reconstructions of Zeeman signals observed with Hinode. Reconstruction is done using a universal PCA basis. Additionally, 
we show the difference on the reconstruction using two different sensing matrices: a Gaussian matrix with zero mean and inverse 
variance equal to the sparsity of the signal and a sparse binary sensing matrix with 10 non-zero elements per measurement. In 
general, we find a better behavior for the sparse binary matrix than for the random full matrix. 
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Fig. 7. Noise level of the CS reconstructed signal versus the noise level of each individual measurement. The left panel shows the 
reconstruction with a wavelet basis set while the right panel has been obtained using a universal PCA basis set. The sparsity in the 
wavelet case is assumed to be 12, while this number is reduced to 5 for the PCA case. The dashed line shows the reconstruction 
error when only noise is taken into account. The dotted line is the diagonal, the expected noise level if a standard spectrograph is 
used. For comparison, the maximum amplitude of the signal is 4x10"'*, so that a noise level of 10"* gives a signal-to-noise ratio of 
4. 



